Parsing Hindi with MDParser

Authors

Alexander Volokh

Günter Neumann

Abstract

We describe our participation in the MTPIL Hindi Parsing Shared Task-2012. Our system achieved the following results: 82.44% LAS / 90.91% UAS (auto) and 85.31% LAS / 92.88% UAS (gold). Our parser is based on linear classification, which is suboptimal as far as accuracy is concerned. The strong point of our approach is its speed: parsing the development data takes 0.935 seconds, which corresponds to a parsing speed of 1318 sentences per second. The Hindi Treebank contains far fewer distinct part-of-speech tags than many other treebanks, so it was essential to use the additional morphosyntactic features available in the treebank. We were able to build classifiers that predict those features with high accuracy, using only the standard word form and part-of-speech features.
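For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how the unlabeled and labeled attachment scores (UAS and LAS) are conventionally computed. It is not the shared task's evaluation script; the function name `attachment_scores`, the `Arc` type, and the toy sentence with its labels are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# This representation is an assumption made for the example.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) in percent for token-aligned gold and predicted arcs."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    # UAS: the predicted head matches the gold head.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: both the head and the dependency label match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * uas_hits / n, 100.0 * las_hits / n


# Toy four-token sentence with made-up labels (head 0 denotes the root).
gold = [(2, "subj"), (0, "root"), (2, "obj"), (3, "mod")]
pred = [(2, "subj"), (0, "root"), (2, "iobj"), (2, "mod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}%, LAS = {las:.2f}%")
```

In the toy sentence, three of four heads are correct while only two tokens have both head and label correct, which is why LAS can never exceed UAS.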

Similar resources

Combining Deterministic Dependency Parsing and Linear Classification for Robust RTE

We present a robust RTE approach which is built as one module incorporating all possible knowledge sources in the form of different features. This way we can easily include or remove knowledge sources which are involved in the process of judging the entailment relation. We perform numerous tests in which we analyse the contribution of different types of features based on word forms, structural in...

Explicit Argument Identification for Discourse Parsing In Hindi: A Hybrid Pipeline

Shallow discourse parsing enables us to study discourse as a coherent piece of information rather than a sequence of clauses, sentences and paragraphs. In this paper, we identify arguments of explicit discourse relations in Hindi. This is the first such work carried out for Hindi. Building upon previous work carried out on discourse connective identification in Hindi, we propose a hybrid pipeli...

Exploiting Language Variants Via Grammar Parsing Having Morphologically Rich Information

In this paper, the development and evaluation of the Urdu parser is presented along with the comparison of existing resources for the language variants Urdu/Hindi. This parser was given a linguistically rich grammar extracted from a treebank. This context free grammar with sufficient encoded information is comparable with the state of the art parsing requirements for morphologically rich and cl...

Bidirectional Dependency Parser for Hindi, Telugu and Bangla

This paper describes the dependency parser we used in the NLP Tools Contest, 2009 for parsing Hindi, Bangla and Telugu. The parser uses a bidirectional parsing algorithm with two operations proj and non-proj to build the dependency tree. The parser obtained Labeled Attachment Score of 71.63%, 59.86% and 67.74% for Hindi, Telugu and Bangla respectively on the treebank with fine-grained dependenc...

Context Based Statistical Morphological Analyzer and its Effect on Hindi Dependency Parsing

This paper revisits the work of (Malladi and Mannem, 2013) which focused on building a Statistical Morphological Analyzer (SMA) for Hindi and compares the performance of SMA with another existing statistical analyzer, Morfette. We shall evaluate SMA in various experiment scenarios and look at how it performs for unseen words. The later part of the paper presents the effect of the predicted morph ...

Publication date: 2013
